SPARQL in the cloud using Rya

Authors

  • Roshan Punnoose
  • Adina Crainiceanu
  • David Rapp
Abstract

SPARQL is the standard query language for Resource Description Framework (RDF) data. RDF was designed with the initial goal of developing metadata for the Internet. While the number and size of generated RDF datasets are continually increasing, most of today’s best RDF storage solutions are confined to a single node. Working on a single node has significant scalability issues, especially considering the magnitude of modern-day data. In this paper we introduce Rya, a scalable RDF data management system that efficiently supports SPARQL queries. We introduce storage methods, indexing schemes, and query processing techniques that scale to billions of triples across multiple nodes, while providing fast and easy access to the data through conventional query mechanisms such as SPARQL. Our performance evaluation shows that in most cases, our system outperforms existing distributed RDF solutions, even systems much more complex than ours.
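
The abstract mentions indexing schemes without describing them. As a rough illustration only, the sketch below assumes a permutation-based layout of the kind often used on sorted key-value stores: each triple is written under three key orderings (subject-predicate-object, predicate-object-subject, object-subject-predicate), so that a triple pattern with any combination of bound terms can be answered by a prefix range scan. The names `TripleIndex`, `choose_order`, `bound_prefix`, and the `SEP` delimiter are hypothetical, and the in-memory sorted lists stand in for distributed tables.

```python
from bisect import bisect_left

SEP = "\x00"  # assumed delimiter; triple components must not contain it

# Each ordering lists which triple position (0=s, 1=p, 2=o) comes first in the key.
ORDERS = {"spo": (0, 1, 2), "pos": (1, 2, 0), "osp": (2, 0, 1)}


def bound_prefix(order, pattern):
    """Return the leading components of `pattern` that are bound under `order`."""
    parts = []
    for i in order:
        if pattern[i] is None:
            break
        parts.append(pattern[i])
    return parts


def choose_order(pattern):
    """Pick the ordering with the longest bound key prefix for this pattern."""
    return max(ORDERS, key=lambda name: len(bound_prefix(ORDERS[name], pattern)))


class TripleIndex:
    """Toy stand-in for three sorted tables, one per key ordering."""

    def __init__(self):
        self.tables = {name: [] for name in ORDERS}

    def add(self, s, p, o):
        triple = (s, p, o)
        for name, order in ORDERS.items():
            key = SEP.join(triple[i] for i in order)
            table = self.tables[name]
            pos = bisect_left(table, key)
            if pos == len(table) or table[pos] != key:  # skip duplicates
                table.insert(pos, key)

    def match(self, s=None, p=None, o=None):
        """Answer a triple pattern (None = unbound) with one prefix scan."""
        pattern = (s, p, o)
        name = choose_order(pattern)
        order = ORDERS[name]
        parts = bound_prefix(order, pattern)
        prefix = SEP.join(parts)
        if 0 < len(parts) < 3:
            prefix += SEP  # stop "ex:a" from matching "ex:ab"
        table = self.tables[name]
        results = []
        for key in table[bisect_left(table, prefix):]:
            if not key.startswith(prefix):
                break  # keys sharing a prefix are contiguous in sorted order
            triple = [None, None, None]
            for slot, value in zip(order, key.split(SEP)):
                triple[slot] = value
            # Re-check every bound term; the prefix only narrows the scan.
            if all(w is None or w == g for w, g in zip(pattern, triple)):
                results.append(tuple(triple))
        return results


idx = TripleIndex()
idx.add("ex:alice", "ex:knows", "ex:bob")
idx.add("ex:alice", "ex:worksAt", "ex:acme")
idx.add("ex:bob", "ex:knows", "ex:carol")
```

With this layout, a pattern such as `idx.match(p="ex:knows")` is served from the `pos` ordering and `idx.match(s="ex:alice", o="ex:acme")` from the `osp` ordering, so no pattern with at least one bound term needs a full scan. The cost of that choice is storing each triple three times.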

Related resources

Vergleich und Evaluation von RDF-on-Hadoop-Lösungen

As more and more data is published in the form of the Resource Description Framework (RDF), the resulting volume of data means that data operations can no longer be handled by a single machine. This thesis presents systems that address this problem using the Hadoop framework, either exclusively or in combination with other big data frameworks ...

Principal component analysis of internal egg quality traits and some performance traits of Azerbaijan native chickens

One of the main problems of multiple-trait genetic evaluation in poultry breeding is high computing costs. Principal components analysis (PCA) is a method for reducing the number of traits in correlated trait analysis. The aim of the present study was to determine the most effective principal components (PCs) of internal egg quality and some performance traits of Azarbayjan native chickens. ...

Scalable RDF Graph Querying Using Cloud Computing

With the explosion of semantic web technologies, conventional SPARQL processing tools do not scale well to large amounts of RDF data because they are designed for use in a single-machine context. Several optimization solutions combined with cloud computing technologies have been proposed to overcome these drawbacks. However, these approaches only consider the SPARQL Basic Graph Pattern pro...

Sapphire: Querying RDF Data Made Simple

There is currently a large amount of publicly accessible structured data available as RDF data sets. For example, the Linked Open Data (LOD) cloud now consists of thousands of RDF data sets with over 30 billion triples, and the number and size of the data sets are continuously growing. Many of the data sets in the LOD cloud provide public SPARQL endpoints to allow issuing queries over them. Thes...

LHD: Optimising Linked Data Query Processing Using Parallelisation

In the past few years, a large volume of Linked Data has been published, and processing distributed SPARQL queries over the Linked Data cloud is becoming increasingly challenging. The high data traffic cost and response time significantly affect the performance of distributed SPARQL queries as the number of SPARQL endpoints and the volume of data at each endpoint increase. In this context, para...

Journal:
  • Inf. Syst.

Volume 48

Published 2015